Fix over_rollout #500

Merged
pan-x-c merged 4 commits into agentscope-ai:main from luyi256:fix_overrollout
Feb 11, 2026

@luyi256
Collaborator

@luyi256 luyi256 commented Feb 9, 2026

Description

Fix over_rollout: once min_num tasks have finished, clear all remaining unfinished tasks instead of ignoring them. Previously the ignored tasks kept consuming resources until they finished.
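
The difference between ignoring and clearing leftover tasks can be sketched with plain asyncio. This is only an illustration under stated assumptions, not the scheduler code from this PR: `rollout`, `collect_min_then_cancel`, and the sleep durations are hypothetical stand-ins, and the real scheduler manages runners rather than bare asyncio tasks.

```python
import asyncio


async def rollout(task_id: int, seconds: float) -> int:
    """Hypothetical stand-in for one rollout task; it sleeps instead of working."""
    await asyncio.sleep(seconds)
    return task_id


async def collect_min_then_cancel(durations: list[float], min_num: int) -> tuple[list[int], int]:
    """Return the ids of the tasks that finished and the number of tasks cancelled."""
    pending = {asyncio.create_task(rollout(i, d)) for i, d in enumerate(durations)}
    finished: list[int] = []
    # Collect results until at least min_num tasks have completed.
    while pending and len(finished) < min_num:
        done, pending = await asyncio.wait(pending, return_when=asyncio.FIRST_COMPLETED)
        finished.extend(task.result() for task in done)
    # Cancel the stragglers instead of ignoring them, so they stop using resources.
    for task in pending:
        task.cancel()
    # Wait for the cancellations to be processed before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    return sorted(finished), len(pending)
```

Calling `asyncio.run(collect_min_then_cancel([0.01, 0.05, 5.0, 5.0], min_num=2))` returns as soon as the two short tasks finish; the two 5-second tasks are cancelled rather than left running in the background.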

Checklist

Please check the following items before code is ready to be reviewed.

  • Code has passed all tests
  • Docstrings have been added/updated in Google Style
  • Documentation has been updated
  • Code is ready for review

@gemini-code-assist
Contributor

Summary of Changes

Hello @luyi256, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a resource consumption issue in the task scheduler where unfinished tasks would continue to occupy resources even after a batch had met its minimum completion requirements and a grace period. By introducing a dedicated cleanup mechanism, the changes ensure that these lingering tasks are properly terminated and their associated runners are reset, leading to more efficient resource utilization and preventing potential resource leaks.

Highlights

  • Resource Management Improvement: Introduced a new private helper method, _cleanup_batch_and_restart_runners, to centralize the logic for clearing timed-out tasks and restarting their associated runners for a given batch.
  • Over-Rollout Fix: Modified the get_results method to invoke this new cleanup logic when the over_rollout.wait_after_min condition is met, ensuring that any remaining unfinished tasks are properly cleared and resources are released after the minimum task threshold is reached and a waiting period has passed.
  • Code Refactoring: Replaced duplicated task cleanup and runner restart logic within the get_results method's general timeout handling with a call to the newly introduced _cleanup_batch_and_restart_runners method, improving code maintainability.

Changelog

  • trinity/explorer/scheduler.py
    • Added a new private method _cleanup_batch_and_restart_runners to handle the clearing of timed-out tasks and restarting of runners for a specific batch.
    • Updated the get_results method to call _cleanup_batch_and_restart_runners when the over_rollout.wait_after_min condition is met, ensuring proper resource cleanup.
    • Refactored the existing timeout handling within get_results to utilize the new _cleanup_batch_and_restart_runners method, reducing code duplication.
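
The structure described in the changelog, one cleanup helper shared by the timeout path and the over-rollout path, might look roughly like the toy model below. The names `_cleanup_batch_and_restart_runners`, `get_results`, and `wait_after_min` come from the summary above; everything else (`ToyScheduler`, `schedule`, the `restarted` list, the 0.01 s polling interval) is an assumption made for illustration and does not reflect the actual contents of trinity/explorer/scheduler.py.

```python
import asyncio


class ToyScheduler:
    """Toy model (not the Trinity scheduler): one cleanup helper, two callers."""

    def __init__(self) -> None:
        self.pending: dict[int, set] = {}  # batch_id -> unfinished asyncio tasks
        self.restarted: list[int] = []     # batches whose runners were "restarted"

    def schedule(self, batch_id: int, durations: list[float]) -> None:
        # Each task sleeps and returns its duration; call this inside a running loop.
        self.pending[batch_id] = {
            asyncio.create_task(asyncio.sleep(d, result=d)) for d in durations
        }

    async def _cleanup_batch_and_restart_runners(self, batch_id: int) -> None:
        tasks = self.pending.pop(batch_id, set())
        for task in tasks:
            task.cancel()
        # Wait until the cancellations have actually been processed.
        await asyncio.gather(*tasks, return_exceptions=True)
        self.restarted.append(batch_id)  # placeholder for the real runner restart

    async def get_results(self, batch_id, min_num, wait_after_min, timeout):
        loop = asyncio.get_running_loop()
        start, min_reached_at, results = loop.time(), None, []
        while self.pending.get(batch_id):
            now = loop.time()
            timed_out = now - start >= timeout
            grace_over = min_reached_at is not None and now - min_reached_at >= wait_after_min
            if timed_out or grace_over:
                # Both exits go through the same helper.
                await self._cleanup_batch_and_restart_runners(batch_id)
                break
            done, rest = await asyncio.wait(
                self.pending[batch_id], timeout=0.01, return_when=asyncio.FIRST_COMPLETED
            )
            self.pending[batch_id] = rest
            results.extend(task.result() for task in done)
            if min_reached_at is None and len(results) >= min_num:
                min_reached_at = loop.time()
        self.pending.pop(batch_id, None)
        return sorted(results)
```

Routing both exits through one helper means a change to the cleanup behavior only has to be made in one place, which is the maintainability point the changelog makes.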

Activity

  • No specific activity (comments, reviews, or progress updates) has been recorded for this pull request yet.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the logic for cleaning up unfinished tasks in over_rollout scenarios. A new method _cleanup_batch_and_restart_runners is introduced to encapsulate the cleanup logic, which is now called when a batch times out or when the over_rollout condition is met after reaching the minimum number of completed tasks. This fixes an issue where unfinished tasks would continue to consume resources.

My review focuses on improving the robustness of the new cleanup method by handling potential exceptions during runner restarts and ensuring the asynchronous calls are correctly awaited. I've also suggested a minor performance improvement in the get_results method.
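
One way to address both robustness points, awaiting every restart and tolerating individual failures, is `asyncio.gather` with `return_exceptions=True`. The sketch below is an assumption about how that could look, not the code that was merged: `restart_runner` and `restart_all` are hypothetical names, and a real runner restart would be a remote call rather than a local coroutine.

```python
import asyncio


async def restart_runner(runner_id: int) -> int:
    """Hypothetical restart call; runner 1 fails to simulate a broken runner."""
    await asyncio.sleep(0)
    if runner_id == 1:
        raise RuntimeError(f"runner {runner_id} failed to restart")
    return runner_id


async def restart_all(runner_ids: list[int]) -> tuple[list[int], list[int]]:
    """Await every restart and return (restarted ids, failed ids)."""
    # return_exceptions=True keeps one failed restart from aborting the others.
    outcomes = await asyncio.gather(
        *(restart_runner(r) for r in runner_ids), return_exceptions=True
    )
    ok = [r for r, out in zip(runner_ids, outcomes) if not isinstance(out, BaseException)]
    failed = [r for r, out in zip(runner_ids, outcomes) if isinstance(out, BaseException)]
    return ok, failed
```

With this shape the caller can log or retry the failed runners while the healthy ones go straight back into service.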

@luyi256
Collaborator Author

luyi256 commented Feb 9, 2026

/unittest-module-common

@github-actions

github-actions bot commented Feb 9, 2026

Summary

Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️
55 | 54 | 0 | 1 | 0 | 0 | 10m 30s

Skipped

Tests Status
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async skipped ⏭️

Tests

Test Name Status Flaky Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid 9h 22m
tests/common/config_test.py::TestConfig::test_chat_template_path 5m
tests/common/config_test.py::TestConfig::test_config_flatten 33.1s
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid 6m 24s
tests/common/config_test.py::TestConfig::test_default_workflow 4m 59s
tests/common/config_test.py::TestConfig::test_load_default_config 1h 30m
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly 5m 4s
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation 5m 5s
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster 30m 58s
tests/common/experience_test.py::TestEID::test_eid_properties 524ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type 481ms
tests/common/experience_test.py::TestExperience::test_assertions 325ms
tests/common/experience_test.py::TestExperience::test_dpo_experience 393ms
tests/common/experience_test.py::TestExperience::test_gather 909ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward 579ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion 15.3s
tests/common/experience_test.py::TestExperience::test_multi_turn_experience 346ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize 2.0s
tests/common/experience_test.py::TestExperience::test_single_turn_experience 367ms
tests/common/experience_test.py::TestExperience::test_to_dict 313ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion 667ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion 526ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion 795ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields 476ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion 573ms
tests/common/sudoku_test.py::test_9x9_generator_produces_valid_solution 859ms
tests/common/sudoku_test.py::test_9x9_generator_creates_holes 618ms
tests/common/sudoku_test.py::test_9x9_solution_is_fully_filled 942ms
tests/common/sudoku_test.py::test_judge_allows_incomplete_board 242ms
tests/common/sudoku_test.py::test_judge_detects_row_violation 229ms
tests/common/sudoku_test.py::test_judge_detects_column_violation 216ms
tests/common/sudoku_test.py::test_judge_detects_block_violation 219ms
tests/common/sudoku_test.py::test_4x4_generator_produces_valid_solution 273ms
tests/common/sudoku_test.py::test_4x4_solution_is_fully_filled 253ms
tests/common/sudoku_test.py::test_4x4_judge_detects_row_violation 230ms
tests/common/sudoku_test.py::test_4x4_judge_detects_block_violation 216ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate 16h
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate 10h 45m
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate 10h 44m
tests/common/vllm_test.py::TestModelLen_0::test_model_len 8h 56m
tests/common/vllm_test.py::TestModelLen_1::test_model_len 7h
tests/common/vllm_test.py::TestModelLen_2::test_model_len 7h 43m
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len 7h 37m
tests/common/vllm_test.py::TestMessageProcess::test_no_prompt_truncation 7h 30m
tests/common/vllm_test.py::TestMessageProcess::test_truncation_status 7h 32m
tests/common/vllm_test.py::TestAPIServer::test_api 7h 56m
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api 7h 41m
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async 8h 4m
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async ⏭️ 630ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask 4m 15s
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools 3m 57s
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls 8h 42m
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls 8h 37m
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate 25h 54m
tests/common/vllm_test.py::TestTinkerAPI::test_tinker_api 11h 12m

Github Test Reporter by CTRF 💚

@pan-x-c
Collaborator

pan-x-c commented Feb 9, 2026

/unittest-module-explorer

@github-actions

github-actions bot commented Feb 9, 2026

Summary

Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️
49 | 49 | 0 | 0 | 0 | 0 | 13m 2s

Tests

Test Name Status Flaky Duration
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer 31h 44m
tests/explorer/explorer_test.py::TestExplorerEvalDetailedStats::test_explorer 20h 30m
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer 15h 21m
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer 50h 16m
tests/explorer/explorer_test.py::ServeTest::test_serve 14h 59m
tests/explorer/proxy_test.py::RecorderTest::test_recorder 1m 8s
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow 1h 24m
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations 1h 24m
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout 3h 33m
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results 5h 33m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0 1h 19m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1 1h 19m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0 1h 18m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1 1h 23m
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution 1h 27m
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow 1h 26m
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait 2h 28m
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods 4h 6m
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop 2h 29m
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks 2h 16m
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid 6h 58m
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all 2h 13m
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch 3h 48m
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection 2h 49m
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0 1.5s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1 10m 2s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0 917ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1 16m 42s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error 883ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps 16m 42s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow 27.1s
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow 16.7s
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow 7m 30s
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow 4.1s
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow 11.6s
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow 7.7s
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0 785ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1 1m 42s
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0 989ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1 3m 21s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow 6h 30m
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow 6h 30m
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording 1h 6m
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v0 12m 29s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v1 14.5s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner 2m 17s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state 2h 14m
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai 7h 30m
tests/explorer/workflow_test.py::TestConcurrentWorkflowRunner::test_concurrent_workflow_runner 10h 46m

弈路 added 2 commits February 9, 2026 22:46
…led; fix schedule: wait for cleanup if timeout and enabling clear (also for over_rollout)
@luyi256
Collaborator Author

luyi256 commented Feb 10, 2026

/unittest-module-explorer

@github-actions

Summary

Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️
49 | 49 | 0 | 0 | 0 | 0 | 13m 33s

Tests

Test Name Status Flaky Duration
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer 1m 56s
tests/explorer/explorer_test.py::TestExplorerEvalDetailedStats::test_explorer 1m 22s
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer 53.8s
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer 3m
tests/explorer/explorer_test.py::ServeTest::test_serve 1m
tests/explorer/proxy_test.py::RecorderTest::test_recorder 60ms
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow 5.2s
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations 4.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout 13.1s
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results 29.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0 5.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1 4.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0 4.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1 5.1s
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution 5.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow 5.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait 13.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods 15.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop 9.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks 8.2s
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid 25.1s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all 7.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch 13.4s
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection 10.0s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0 1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1 602ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0 1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1 1.0s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error 1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps 1.0s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow 26ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow 16ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow 128ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow 3ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow 10ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow 7ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1 100ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1 201ms
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow 23.4s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow 23.4s
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording 4.0s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v0 743ms
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v1 13ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner 138ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state 8.0s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai 26.8s
tests/explorer/workflow_test.py::TestConcurrentWorkflowRunner::test_concurrent_workflow_runner 38.9s

@pan-x-c pan-x-c merged commit 9aefc27 into agentscope-ai:main Feb 11, 2026
1 check passed

3 participants